Introduction to Generative Modeling: Beyond Classification

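We are moving from discriminative modeling, which learns the conditional probability $P(y|x)$ to solve classification and regression problems, to generative modeling. The core goal is now density estimation: learning the full underlying data distribution $P(x)$ directly. This shift lets a model capture the complex correlations and structure present in high-dimensional datasets, so it can go beyond separating classes with a decision boundary and describe, and generate, the data itself.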

1. The Goal of a Generative Model: Modeling $P(x)$

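The goal of a generative model is to estimate the probability distribution $P(x)$ from which the training data $X$ was drawn. A successful generative model can perform three important tasks: (1) density estimation (assigning a probability score to an input $x$), (2) sampling (generating entirely new data points $x_{new} \sim P(x)$), and (3) unsupervised feature learning (discovering meaningful, disentangled representations in a latent space).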
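As a minimal sketch of tasks (1) and (2), the simplest explicit density model is a single multivariate Gaussian fitted by maximum likelihood. The toy data, the names `log_density`, `typical_score`, and `outlier_score`, and the choice of a Gaussian are all assumptions made for illustration; real generative models are far more expressive.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "training data" X: 1000 points from a 2-D Gaussian (illustrative only).
true_mean = np.array([1.0, -2.0])
true_cov = np.array([[1.0, 0.6], [0.6, 2.0]])
X = rng.multivariate_normal(true_mean, true_cov, size=1000)

# Fit the model: maximum-likelihood estimates of the mean and covariance.
mu = X.mean(axis=0)
sigma = np.cov(X, rowvar=False, bias=True)

def log_density(x, mu, sigma):
    """Log of the fitted Gaussian density, evaluated at each row of x."""
    x = np.atleast_2d(x)
    d = mu.shape[0]
    diff = x - mu
    # Squared Mahalanobis distance, using solve() rather than an explicit inverse.
    maha = np.einsum("ij,ij->i", diff, np.linalg.solve(sigma, diff.T).T)
    _, logdet = np.linalg.slogdet(sigma)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

# (1) Density estimation: score a typical point and a far-away point.
typical_score = log_density(mu, mu, sigma)[0]
outlier_score = log_density(np.array([10.0, 10.0]), mu, sigma)[0]

# (2) Sampling: draw new points x_new from the fitted distribution.
x_new = rng.multivariate_normal(mu, sigma, size=5)
```

The point far from the training data receives a much lower log-density than the point at the fitted mean, which is the behavior that density estimation relies on.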

2. Taxonomy: Explicit vs. Implicit Likelihood

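Generative models are categorized primarily by how they treat the likelihood function. Explicit density models, such as variational autoencoders (VAEs) and flow models, define a likelihood function and try to maximize it (or, in the case of VAEs, a lower bound on it). Implicit density models, most notably generative adversarial networks (GANs), skip likelihood computation entirely and instead use an adversarial training framework to learn a mapping function whose outputs can be treated as samples from $P(x)$.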
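For reference, the commonly cited textbook forms of these objectives are sketched below. The notation is not from the original text and is assumed here: $\theta$ and $\phi$ are model parameters, $z$ is a latent variable with prior $p(z)$, $f_\theta$ is an invertible flow mapping, and $D$ and $G$ are the GAN discriminator and generator.

Flow model (exact log-likelihood via change of variables):

$$\log p_\theta(x) = \log p_Z\big(f_\theta(x)\big) + \log\left|\det \frac{\partial f_\theta(x)}{\partial x}\right|$$

VAE (evidence lower bound on the log-likelihood):

$$\log p_\theta(x) \ge \mathbb{E}_{q_\phi(z|x)}\big[\log p_\theta(x|z)\big] - D_{\mathrm{KL}}\big(q_\phi(z|x)\,\|\,p(z)\big)$$

GAN (no likelihood term; a minimax game between $G$ and $D$):

$$\min_G \max_D \; \mathbb{E}_{x \sim P(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p(z)}\big[\log\big(1 - D(G(z))\big)\big]$$

Only the first two expressions contain $p_\theta(x)$; the GAN objective never evaluates a density, which is what makes it implicit.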

Data Synthesis and Feature Interpolation

Generative models demonstrate their capability by generating novel, high-fidelity instances (e.g., unseen faces, complex textures) or by allowing semantic interpolation in the learned latent space, illustrating the model's grasp of data variability.

Examples of AI-generated faces and interpolated features.
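Semantic interpolation can be sketched as walking a straight line between two latent codes and decoding each point along the way. In the sketch below, `lerp` and `toy_decoder` are hypothetical names, and the decoder is a fixed linear map standing in for a trained generator or VAE decoder, which is an assumption made purely so the example runs.

```python
import numpy as np

def lerp(z_a, z_b, num_steps):
    """Return num_steps latent vectors evenly spaced from z_a to z_b."""
    t = np.linspace(0.0, 1.0, num_steps)[:, None]
    return (1.0 - t) * z_a + t * z_b

def toy_decoder(z):
    """Hypothetical stand-in for a trained decoder: a fixed linear map."""
    # Maps a 2-D latent vector to a 3-D "data" vector.
    W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    return z @ W.T

z_a = np.array([0.0, 0.0])   # latent code of the first example
z_b = np.array([2.0, 4.0])   # latent code of the second example

path = lerp(z_a, z_b, num_steps=5)   # shape (5, 2)
decoded = toy_decoder(path)          # shape (5, 3), one output per step
```

With a real model, each row of `decoded` would be an image or other sample, and a smooth transition across the rows is taken as evidence that the latent space is well organized.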

Question 1

In generative modeling, what is the primary distribution of interest?

$P(x)$

$P(y|x)$

$P(x|y)$

$P(y)$

Question 2

Which type of generative model relies on adversarial training and avoids defining an explicit likelihood function?

Variational Autoencoder (VAE)

Autoregressive Model

Generative Adversarial Network (GAN)

Gaussian Mixture Model (GMM)

Challenge: Anomaly Detection

Leveraging Density Estimation

A financial institution has trained an explicit density generative model $G$ on millions of legitimate transaction records. A new transaction $x_{new}$ arrives.

Goal: Determine if $x_{new}$ is an anomaly (fraud).

Step 1

Based on the density estimate of $P(x)$, what statistical measure must be evaluated for $x_{new}$ to flag it as anomalous?

Solution:
The model must evaluate the probability (or likelihood) $P(x_{new})$. If $P(x_{new})$ falls below a predefined threshold $\tau$, meaning the new point is statistically improbable under the learned distribution of normal transactions, it is flagged as an anomaly.
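The thresholding step can be sketched as follows. Everything concrete here is an assumption for illustration: the synthetic two-feature transactions, the Gaussian used in place of the model $G$, the names `log_likelihood` and `is_anomaly`, and the rule of setting $\tau$ at the 0.1th percentile of training scores. The comparison is done on log-likelihoods, which is equivalent to comparing $P(x_{new})$ with $\tau$ when $\tau$ is also expressed on the log scale.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for legitimate transactions: (amount, hour of day).
legit = rng.normal(loc=[50.0, 14.0], scale=[15.0, 3.0], size=(5000, 2))

# Stand-in for the density model G: a Gaussian fitted to the legitimate data.
mu = legit.mean(axis=0)
sigma = np.cov(legit, rowvar=False)

def log_likelihood(x):
    """Log-density of each row of x under the fitted Gaussian."""
    x = np.atleast_2d(x)
    diff = x - mu
    maha = np.einsum("ij,ij->i", diff, np.linalg.solve(sigma, diff.T).T)
    _, logdet = np.linalg.slogdet(sigma)
    return -0.5 * (mu.size * np.log(2 * np.pi) + logdet + maha)

# Threshold tau (log scale): the 0.1th percentile of training log-likelihoods,
# so roughly 0.1% of legitimate transactions would be flagged.
tau = np.percentile(log_likelihood(legit), 0.1)

def is_anomaly(x_new):
    """Flag x_new when its log-likelihood falls below tau."""
    return bool(log_likelihood(x_new)[0] < tau)

typical = np.array([55.0, 13.0])      # ordinary amount, mid-day
suspicious = np.array([900.0, 3.0])   # very large amount at 3 a.m.
```

Calling `is_anomaly(typical)` and `is_anomaly(suspicious)` then applies the rule from the solution. In practice the percentile is a tuning choice that trades false alarms against missed fraud.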